24 research outputs found

    Modular resource development and diagnostic evaluation framework for fast NLP system improvement

    Natural Language Processing systems are large-scale software systems whose development involves many person-years of work, in both coding and resource development. Given a dictionary of 110k lemmas, a few hundred syntactic analysis rules, 20k n-gram matrices and other resources, what will be the impact on a syntactic analyzer of adding a new possible category to a given verb? What will be the consequences of adding a new syntactic rule? Any modification may have, beyond the expected effect, unforeseeable side effects, and the complexity of the system makes it difficult to estimate the overall impact of even small changes. We present a framework designed to effectively and iteratively improve the accuracy of our linguistic analyzer, LIMA, through iterative refinements of its linguistic resources. These improvements are continuously assessed by evaluating the analyzer's performance against a reference corpus. Our first results show that this framework is genuinely helpful towards this goal.
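    The evaluate-refine loop described in this abstract can be sketched as follows. This is a minimal illustration only: the analyzer interface and the reference-corpus format are hypothetical stand-ins, not LIMA's real API.

    ```python
    # Sketch of the iterative evaluate-refine loop: after each resource
    # modification, re-run the analyzer on a gold reference corpus and
    # measure the global impact of the change.

    def accuracy(analyzer, reference):
        """Fraction of sentences whose analysis matches the gold reference."""
        correct = sum(1 for sent, gold in reference if analyzer(sent) == gold)
        return correct / len(reference)

    def assess_change(analyzer_before, analyzer_after, reference):
        """Global impact of a resource modification on accuracy.
        Positive: improvement; negative: regression (a side effect)."""
        return accuracy(analyzer_after, reference) - accuracy(analyzer_before, reference)

    # Toy example: "analyzers" are stand-in functions over (sentence, gold) pairs.
    reference = [("a", "A"), ("b", "B"), ("c", "X")]
    old = str.upper                                    # analyzes 2 of 3 correctly
    new = lambda s: {"c": "X"}.get(s, s.upper())       # analyzes 3 of 3 correctly
    delta = assess_change(old, new, reference)
    ```

    The point of such a harness is that every resource edit, however small, is scored against the same reference corpus, so regressions elsewhere in the pipeline surface immediately.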

    Revisiting knowledge-based Semantic Role Labeling

    Semantic role labeling has seen tremendous progress in recent years, for both supervised and unsupervised approaches. Knowledge-based approaches have been neglected, even though they have been shown to bring the best results on the related word sense disambiguation task. We contribute a simple knowledge-based system with an easy-to-reproduce specification. We also present a novel approach to handling the passive voice in the context of semantic role labeling that reduces the F1 error rate by 15.7%, showing that significant improvements can be achieved while retaining the key advantages of the approach: it is simple, which facilitates the analysis of individual errors; it does not need any hand-annotated corpora; and it is not domain-specific.
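    The core idea of passive-voice handling in role labeling can be illustrated with a deliberately simplified sketch. The detection heuristic, the PropBank-style labels, and the role slots below are illustrative assumptions, not the paper's actual specification (which would operate on a full syntactic analysis).

    ```python
    # Minimal sketch of passive-voice handling for role labeling.
    # Assumption: PropBank-style labels (Arg0 = agent, Arg1 = patient).
    # Real systems detect the passive from the parse tree, not from tokens.

    def is_passive(tokens, verb_index):
        """Crude heuristic: a form of 'be' immediately precedes a past participle."""
        return (verb_index > 0
                and tokens[verb_index - 1] in {"is", "was", "are", "were", "been", "being"}
                and tokens[verb_index].endswith("ed"))

    def remap_roles(roles, passive):
        """In a passive clause, the syntactic subject carries the patient role
        (Arg1) and the by-phrase carries the agent role (Arg0)."""
        if not passive:
            return roles
        swapped = dict(roles)
        swapped["subject"], swapped["by-object"] = "Arg1", "Arg0"
        return swapped

    tokens = ["the", "car", "was", "damaged", "by", "John"]
    roles = remap_roles({"subject": "Arg0", "by-object": "Arg1"},
                        is_passive(tokens, 3))
    ```

    Without such a remapping, a labeler trained or specified on active clauses would wrongly tag "the car" as the agent of "damaged".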

    WoNeF: improvement, extension and evaluation of an automatic French translation of WordNet

    Identifying the possible senses of the words of a vocabulary is a difficult problem that requires very substantial manual work. This work has been carried out for English: the result is the WordNet lexical database, which still has few equivalents in other languages. Nevertheless, automatic translations of WordNet into many target languages exist, notably for French. JAWS is one such automatic translation, built using dictionaries and a syntactic language model. We improve this translation, extend it with WordNet's verbs and adjectives, and demonstrate the validity of our approach through a new manual evaluation. In addition to the main version, named WoNeF, we produce two additional versions: a high-precision version (93% precision, up to 97% for nouns), and a high-coverage version containing 109,447 (literal, synset) pairs.

    Adapting VerbNet to French using existing resources

    VerbNet is an English lexical resource for verbs that has proven useful for English NLP thanks to its high coverage and coherent classification. No such resource exists for other languages, despite some (mostly automatic and unsupervised) attempts. We show how to semi-automatically adapt VerbNet using existing resources designed for different purposes. This study focuses on French and uses two French resources: a semantic lexicon (Les Verbes Français) and a syntactic lexicon (Lexique-Grammaire).

    Developing a French FrameNet: Methodology and First results

    The Asfalda project aims to develop a French corpus with frame-based semantic annotations, along with automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data where necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us enforce the coherence of the resulting resource; it also has the advantage that, although the number of frames is limited (around a hundred), we obtain full coverage within a given domain.

    Getting reliable answers by exploiting results from several sources of information

    A question-answering system is more convincing if it can give the user an indication of the reliability of its answers. To address this problem, we chose to combine the results of several searches. First, we search for answers in a reliable document collection; second, we search on the Web. When both sources of knowledge lead the system to common answers, we have more confidence in those answers and boost them to the first ranks.
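    The boosting heuristic described in this abstract (and in the QALC paper below) can be sketched in a few lines. The scoring scheme and answer strings are illustrative assumptions, not the system's actual ranking function.

    ```python
    # Sketch of two-source answer merging: candidate answers from the
    # reliable collection are re-ranked so that any answer also found
    # by the Web search is promoted ahead of unconfirmed ones.

    def merge_answers(collection_answers, web_answers):
        """Rank (answer, score) pairs from the collection, boosting those
        confirmed by the Web search; ties broken by original score."""
        web_set = {a.lower() for a, _ in web_answers}
        ranked = sorted(
            collection_answers,
            key=lambda item: (item[0].lower() in web_set, item[1]),
            reverse=True,
        )
        return [a for a, _ in ranked]

    # "Paris" scores lower in the collection, but is confirmed by the Web,
    # so it is boosted above the unconfirmed "Lyon".
    collection = [("Paris", 0.6), ("Lyon", 0.9)]
    web = [("paris", 0.8), ("Marseille", 0.5)]
    order = merge_answers(collection, web)
    ```

    The design intuition is that agreement between two independent retrieval sources is itself evidence of correctness, which the rank boost encodes.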

    Finding answers on the Web and in a closed collection

    The question answering task, as defined in the TREC-11 evaluation, may rely on a Web search. However, this strategy alone is not sufficient, since Web results are not certified. Our system, QALC, searches both the Web and the AQUAINT text collection. This implies that the system exists in two versions, each one dealing with one kind of resource. In particular, Web queries may be kept extremely precise and still be successful. Relying on results common to both kinds of search yields a better ranking of the answers, and hence a better performance of the QALC system.